A Fast Inference Vision Transformer for Automatic Pavement Image Classification and Its Visual Interpretation Method
Abstract
Traditional automatic pavement distress detection methods using convolutional neural networks (CNNs) require a great deal of time and computing resources and are poor in terms of interpretability. Therefore, inspired by the successful application of the Transformer architecture to natural language processing (NLP) tasks, a novel method called LeViT was introduced for asphalt pavement image classification. LeViT consists of convolutional layers, transformer stages in which multi-layer perceptron (MLP) and multi-head self-attention blocks alternate with residual connections, and two classifier heads. To evaluate the proposed method, datasets from three different sources were used, and pre-trained weights based on ImageNet were obtained. The performance of the model was compared with that of six state-of-the-art (SOTA) deep learning models, all of them trained with a transfer learning strategy. Compared to the tested SOTA models, LeViT has less than 1/8 the parameters of the original Vision Transformer (ViT) and 1/2 those of ResNet and InceptionNet. Experimental results show that after training for 100 epochs with a batch size of 16, LeViT acquired 91.56% accuracy, 91.72% precision and recall, and 91.45% F1-score on the Chinese dataset, and 99.17% accuracy and 99.19% precision on the German dataset, the best results among all tested models. Moreover, it shows superior inference speed (86 ms/step), approximately 25% of that of ViT and 80% of that of some prevailing CNN-based models, including DenseNet, VGG, and ResNet. Overall, LeViT achieves competitive performance at a lower computational cost. In addition, a visualization method combining Grad-CAM and Attention Rollout was used to analyze the classification results and explore what has been learned in every MLP and attention block of LeViT, which improved the interpretability of the model.
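The abstract describes the training setup in prose only. As a rough, hedged sketch of what the transfer learning strategy could look like in practice, the Python snippet below loads an ImageNet-pretrained LeViT from the timm library and swaps in a new classifier head for pavement classes. The variant name (levit_256), the class count, and the learning rate are illustrative assumptions, not values from the paper; only the batch size of 16 and the 100-epoch budget come from the abstract.

```python
# Minimal sketch (not the authors' code): fine-tuning an ImageNet-pretrained
# LeViT for pavement image classification with timm. Variant name, class
# count, and learning rate are assumptions for illustration.
import timm
import torch
from torch import nn

NUM_CLASSES = 3  # hypothetical pavement classes, e.g., crack / repair / intact

# LeViT: convolutional stem, then transformer stages in which MLP and
# multi-head self-attention blocks alternate with residual connections.
# `num_classes` replaces the classifier head(s) for the new label set.
model = timm.create_model("levit_256", pretrained=True, num_classes=NUM_CLASSES)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_one_epoch(model, loader, device="cuda"):
    model.to(device).train()
    for images, labels in loader:  # DataLoader with batch_size=16, per the abstract
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = model(images)  # recent timm versions return averaged head logits
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()
```

Initializing from ImageNet weights and fine-tuning, rather than training from scratch, is what the abstract means by applying the transfer learning strategy to all seven compared models.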
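The abstract names Attention Rollout as one half of the visualization method but gives no formula. For orientation, here is a minimal sketch of the standard rollout computation (Abnar and Zuidema's formulation, on which such methods build): head-averaged attention matrices are mixed with the identity to model the residual connections, re-normalized, and multiplied through the layers. The list of per-layer attention tensors is an assumption here, e.g., collected from the model with forward hooks.

```python
# Minimal sketch (under stated assumptions) of Attention Rollout.
# `attentions` is assumed to be a list of per-layer attention tensors of
# shape (batch, heads, tokens, tokens), e.g., gathered via forward hooks.
import torch

def attention_rollout(attentions):
    """Propagate attention across layers, accounting for residual paths."""
    rollout = None
    for attn in attentions:
        attn = attn.mean(dim=1)                       # average over heads
        eye = torch.eye(attn.size(-1), device=attn.device)
        attn = 0.5 * attn + 0.5 * eye                 # model the residual connection
        attn = attn / attn.sum(dim=-1, keepdim=True)  # re-normalize rows
        rollout = attn if rollout is None else attn @ rollout
    return rollout  # (batch, tokens, tokens): token-to-token relevance
```

The resulting relevance map can be reshaped to the spatial token grid and overlaid on the input image, complementing the gradient-based Grad-CAM view of the same prediction.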
Similar resources
A Lightweight Inference Method for Image Classification
We demonstrate a two-phase classification method, first of individual pixels, then of fixed regions of pixels, for scene classification: the task of assigning posteriors that characterize an entire image. This can be realized with a probabilistic graphical model (PGM), without the segmentation and aggregation tasks characteristic of visual object recognition. Instead the spatial as...
Analogical Inference in Automatic Interpretation
We present findings suggesting that analogical inference can play a role in the fundamental processes involved in automatic comprehension and interpretation. Participants were found to use information from a prior relationally similar example in understanding the content of a currently encoded example. Further, in doing so they were sensitive to structural mappings between the two instances, ru...
AN-EUL method for automatic interpretation of potential field data in unexploded ordnances (UXO) detection
We have applied an automatic interpretation method for potential field data called AN-EUL in unexploded ordnance (UXO) prospecting, which is in fact a combination of the analytic signal and the Euler deconvolution approaches. The method can be applied to both magnetic and gravity data, as well as to gradient surveys, based upon the concept of the structural index (SI) of a potential anomaly, which is relate...
Visual Image Interpretation
Often the first step in a remote sensing change detection study investigating coastal dynamics is to delineate the actual coastline from the available images. This can often be a difficult task, partly because of the uncertainty over what is and is not the coastline. Whilst it may appear to be common sense to use the water line (where sea meets land) as the best indicator, it may not be that easy...
A Fast, Robust, Automatic Blink Detector
Introduction: "Blink" is defined as the closing and opening of the eyes within a short duration of time. In this study, we aimed to introduce a fast, robust, vision-based approach for blink detection. Materials and Methods: This approach consists of two steps. In the first step, the subject's face is localized every second, and with the first blink, the system detects the eye's location and creates an ope...
Journal
Journal title: Remote Sensing
Year: 2022
ISSN: 2315-4632, 2315-4675
DOI: https://doi.org/10.3390/rs14081877